[This article belongs to Volume - 54, Issue - 01]
Gongcheng Kexue Yu Jishu/Advanced Engineering Science
Journal ID : AES-30-06-2022-226

Title : An Innovative Over Sampling Method For Imbalanced Data Reduction
B. Manjula, Shaheen Layaq

Abstract :

Reliability plays a vital role in the software development process, and it decreases when a software application contains defects. Every defect should therefore be identified and addressed at an early stage. In most cases, however, the reliability or accuracy of the application depends on the class with the higher count. During classification, if the defective samples (minority class) are fewer than the non-defective samples (majority class), they tend to be ignored. This problem is caused by class imbalance. Many data-based methods have been proposed to overcome the imbalance problem. Our work also takes a data-based approach and proposes a new method, Imbalanced Data Reduction using Oversampling (IDROS). IDROS is an innovative method that reduces imbalance while overcoming the drawbacks of several previously proposed methods. It gives more importance to the minority class by adding samples to it until the classes are balanced. To add new and appropriate samples, it calculates the mean of the minority data set and, from that center, identifies the K-Nearest Neighbors (KNN), where k is a user-defined parameter. The difference between the majority and minority counts is computed, and that many new samples are generated among the KNN in equal proportion. The imbalanced software defect prediction (SDP) dataset is taken from the PRedictOr Models In Software Engineering (PROMISE) repository and balanced with IDROS, after which Naive Bayes classification is performed. The performance measures (Geometric Mean, Accuracy, Precision, Recall, Specificity and F-Measure) are calculated in the form of “Mean ± Standard Deviation”; in this analysis, the proposed method showed very good accuracy.
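
The oversampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `idros_oversample`, its parameters, and the use of NumPy are hypothetical. The abstract does not say how a new sample is formed "among" the nearest neighbors, so the sketch assumes linear interpolation between the minority mean and each neighbor (in the spirit of SMOTE), and it assumes Euclidean distance for the neighbor search.

```python
import numpy as np

def idros_oversample(X_min, n_majority, k=5, seed=0):
    """Hypothetical sketch of the oversampling step outlined in the abstract.

    X_min      : minority-class samples, shape (n_minority, n_features)
    n_majority : number of majority-class samples
    k          : user-defined number of nearest neighbors
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)

    # Number of samples to add = majority count - minority count.
    n_new = n_majority - len(X_min)
    if n_new <= 0:
        return X_min.copy()  # already balanced, nothing to add

    # Center of the minority class (feature-wise mean).
    center = X_min.mean(axis=0)

    # k nearest minority samples to the center (Euclidean distance assumed).
    k = min(k, len(X_min))
    dists = np.linalg.norm(X_min - center, axis=1)
    neighbors = X_min[np.argsort(dists)[:k]]

    # Split the new samples across the k neighbors in equal proportion;
    # any remainder goes to the first few neighbors.
    counts = np.full(k, n_new // k)
    counts[: n_new % k] += 1

    # ASSUMPTION: each synthetic sample is a random point on the segment
    # between the center and one neighbor. The abstract does not give the rule.
    parts = []
    for neighbor, count in zip(neighbors, counts):
        gaps = rng.random((count, 1))
        parts.append(center + gaps * (neighbor - center))

    return np.vstack([X_min] + parts)
```

For example, calling `idros_oversample(X_min, n_majority=10, k=3)` on four minority samples returns ten rows: the four originals followed by six synthetic ones, two per neighbor. The balanced minority set would then be combined with the majority samples before training a classifier such as Naive Bayes.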